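The human brain lies at the core of a complex neurobiological system in which neurons, circuits, and subsystems interact in enigmatic ways. Understanding the structural and functional mechanisms of the brain has long been an intriguing pursuit for neuroscience research and clinical disorder therapy. Mapping the connections of the human brain as a network is one of the most pervasive paradigms in neuroscience. Graph Neural Networks (GNNs) have recently emerged as a potential method for modeling complex network data. Deep models, on the other hand, have low interpretability, which prevents their use in decision-critical settings such as healthcare. To bridge this gap, we propose an interpretable framework to analyze specific regions of interest (ROIs) and prominent connections. The proposed framework consists of two modules: a brain-network-oriented backbone model for disease prediction and a globally shared explanation generator that highlights disease-specific biomarkers, including salient ROIs and important connections. We conduct experiments on three real-world brain disorder datasets. The results verify that our framework achieves outstanding performance and identifies meaningful biomarkers. All code for this work is available at https://github.com/hennyjie/ibgnn.git.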
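Brain networks characterize the complex connectivity between brain regions as graph structures, which provides a powerful means of studying brain connectomes. In recent years, graph neural networks have become a prevalent learning paradigm for structured data. However, because data acquisition is relatively costly, most brain network datasets are limited in sample size, which hinders sufficient training of deep learning models. Inspired by meta-learning, which learns new concepts quickly from limited training examples, this paper studies data-efficient training strategies for analyzing brain connectomes in a cross-dataset setting. Specifically, we propose to meta-train a model on datasets with large sample sizes and transfer the knowledge to small datasets. In addition, we explore two brain-network-oriented designs: atlas transformation and adaptive task reweighting. Compared with other pre-training strategies, our meta-learning-based approach achieves higher and more stable performance, which demonstrates the effectiveness of the proposed solution. The framework is also able to derive new insights into the similarities between datasets and diseases in a data-driven manner.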
Mapping the connectome of the human brain using structural or functional connectivity has become one of the most pervasive paradigms for neuroimaging analysis. Recently, Graph Neural Networks (GNNs) motivated by geometric deep learning have attracted broad interest due to their established power for modeling complex networked data. Despite their superior performance in many fields, there has not yet been a systematic study of how to design effective GNNs for brain network analysis. To bridge this gap, we present BrainGB, a benchmark for brain network analysis with GNNs. BrainGB standardizes the process by (1) summarizing brain network construction pipelines for both functional and structural neuroimaging modalities and (2) modularizing the implementation of GNN designs. We conduct extensive experiments on datasets across cohorts and modalities and recommend a set of general recipes for effective GNN designs on brain networks. To support open and reproducible research on GNN-based brain network analysis, we host the BrainGB website at https://braingb.us with models, tutorials, and examples, as well as an out-of-the-box Python package. We hope that this work will provide useful empirical evidence and offer insights for future research in this novel and promising direction.
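Graph neural networks (GNNs), as a group of powerful tools for representation learning on irregular data, have shown superiority in various downstream tasks. With unstructured texts represented as concept maps, GNNs can be exploited for tasks such as document retrieval. Intrigued by how GNNs can help document retrieval, we conduct an empirical study on the large-scale multi-discipline dataset CORD-19. The results show that, based on the BM25-retrieved candidates, our proposed semantics-oriented graph functions achieve better and more stable performance than complex structure-oriented GNNs such as GINs and GATs. Our insights from this case study can serve as a guideline for future work on developing effective GNNs with appropriate semantics-oriented inductive biases for textual reasoning tasks such as document retrieval and classification. All code for this case study is available at https://github.com/hennyjie/gnn-docrocrocal.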
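Graph neural networks (GNNs) have been widely used in various graph-related problems such as node classification and graph classification, where their superior performance is mainly established when natural node features are available. However, it is not well understood how GNNs work without natural node features, especially with regard to the various ways of constructing artificial ones. In this paper, we point out two types of artificial node features, namely positional and structural node features, and provide insights into why each of them is more appropriate for certain tasks, namely positional node classification, structural node classification, and graph classification. Extensive experimental results on 10 benchmark datasets validate our insights and lead to a practical guideline for choosing between different artificial node features for GNNs on non-attributed graphs. The code is available at https://github.com/zjzielu/gnn-positional-sstructural-node-features.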
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
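For readers unfamiliar with the target structure, the following is a minimal sketch of a simplex equiangular tight frame, assuming the standard construction sqrt(K/(K-1)) * (I - (1/K) * 1 1^T) with no extra rotation. It is an illustration only, not the authors' regularizer; the function names `simplex_etf` and `dot` are hypothetical.

```python
import math


def simplex_etf(num_classes):
    """Return K unit vectors in R^K that form a simplex equiangular tight frame.

    Assumed construction: sqrt(K / (K - 1)) * (I - (1/K) * ones), no rotation.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Row i is scale * (e_i - ones / k); the matrix is symmetric,
    # so its rows and columns describe the same K vertices.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


vertices = simplex_etf(4)
# Every vertex has unit length ...
norms = [math.sqrt(dot(v, v)) for v in vertices]
# ... and every pair of distinct vertices has cosine -1 / (K - 1).
cosines = [
    dot(vertices[i], vertices[j])
    for i in range(len(vertices))
    for j in range(i + 1, len(vertices))
]
```

All pairwise cosines are equal to -1/(K-1), which is the "equiangular and maximally separated" property that the abstract says is broken by class imbalance and contextual correlation.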
When using LiDAR semantic segmentation models for safety-critical applications such as autonomous driving, it is essential to understand and improve their robustness with respect to a large range of LiDAR corruptions. In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions. To rigorously evaluate the robustness and generalizability of current approaches, we propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups, namely adverse weather, measurement noise, and cross-device discrepancy. Then, we systematically investigate 11 LiDAR semantic segmentation models spanning different input representations (e.g., point clouds, voxels, and projected images), network architectures, and training schemes. Through this study, we obtain two insights: 1) The input representation plays a crucial role in robustness. Specifically, different representations perform differently under specific corruptions. 2) Although state-of-the-art methods for LiDAR semantic segmentation achieve promising results on clean data, they are less robust when dealing with noisy data. Finally, based on the above observations, we design a robust LiDAR segmentation model (RLSeg) that greatly boosts robustness with simple but effective modifications. We hope that our benchmark, comprehensive analysis, and observations can boost future research in robust LiDAR semantic segmentation for safety-critical applications.
Denoising Diffusion Probabilistic Models (DDPMs) are emerging in text-to-speech (TTS) synthesis because of their strong capability of generating high-fidelity samples. However, their iterative refinement process in high-dimensional data space results in slow inference speed, which restricts their application in real-time systems. Previous works have explored speeding up inference by minimizing the number of inference steps, but at the cost of sample quality. In this work, to improve the inference speed of DDPM-based TTS models while achieving high sample quality, we propose ResGrad, a lightweight diffusion model that learns to refine the output spectrogram of an existing TTS model (e.g., FastSpeech 2) by predicting the residual between the model output and the corresponding ground-truth speech. ResGrad has several advantages: 1) Compared with other acceleration methods for DDPMs, which need to synthesize speech from scratch, ResGrad reduces the complexity of the task by changing the generation target from the ground-truth mel-spectrogram to the residual, resulting in a more lightweight model and thus a smaller real-time factor. 2) ResGrad is employed in the inference process of the existing TTS model in a plug-and-play way, without re-training this model. We verify ResGrad on the single-speaker dataset LJSpeech and on two more challenging datasets with multiple speakers (LibriTTS) and a high sampling rate (VCTK). Experimental results show that, in comparison with other speed-up methods for DDPMs: 1) ResGrad achieves better sample quality at the same inference speed, measured by real-time factor; 2) with similar speech quality, ResGrad synthesizes speech more than 10 times faster than baseline methods. Audio samples are available at https://resgrad1.github.io/.
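The change of generation target described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the ResGrad implementation: spectrograms are plain nested lists, the function names are hypothetical, and the diffusion model is replaced by a stand-in that predicts the residual perfectly.

```python
def residual_target(ground_truth_mel, coarse_mel):
    """Training target: ground-truth spectrogram minus the existing TTS model's output."""
    return [
        [g - c for g, c in zip(gt_frame, coarse_frame)]
        for gt_frame, coarse_frame in zip(ground_truth_mel, coarse_mel)
    ]


def refine(coarse_mel, predicted_residual):
    """Inference: add the predicted residual back onto the coarse spectrogram."""
    return [
        [c + r for c, r in zip(coarse_frame, res_frame)]
        for coarse_frame, res_frame in zip(coarse_mel, predicted_residual)
    ]


# Toy 2-frame, 2-bin "spectrograms" (hypothetical values).
coarse = [[1.0, 2.0], [3.0, 4.0]]          # output of the existing TTS model
ground_truth = [[1.5, 1.75], [3.25, 4.0]]  # corresponding ground truth

# Stand-in for the diffusion model: assume it recovers the residual exactly.
predicted = residual_target(ground_truth, coarse)
refined = refine(coarse, predicted)
```

Because the existing TTS model is only read from and never updated, a refinement step of this shape can be attached after it without re-training, which is the plug-and-play property the abstract refers to.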
Crowdsourcing, in which human intelligence and productivity are dynamically mobilized to tackle tasks too complex for automation alone to handle, has grown to be an important research topic and has inspired new businesses (e.g., Uber, Airbnb). Over the years, crowdsourcing has morphed from providing a platform where workers and tasks can be matched up manually into one that leverages data-driven algorithmic management approaches powered by artificial intelligence (AI) to achieve increasingly sophisticated optimization objectives. In this paper, we provide a survey presenting a systematic overview of how AI can empower crowdsourcing, which we refer to as AI-Empowered Crowdsourcing (AIEC). We propose a taxonomy that divides algorithmic crowdsourcing into three major areas: 1) task delegation, 2) motivating workers, and 3) quality control, focusing on the major objectives that need to be accomplished. We discuss the limitations and insights, and curate the challenges of doing research in each of these areas to highlight promising future research directions.
Fine-grained classification and counting of bone marrow erythroid cells are vital for evaluating health status and formulating therapeutic schedules for leukemia or hematopathy. Due to the subtle visual differences between different types of erythroid cells, it is challenging to apply existing image-based deep learning models to fine-grained erythroid cell classification. Moreover, there are no large open-source datasets of erythroid cells to support model training. In this paper, we introduce BMEC (Bone Marrow Erythroid Cells), the first large fine-grained image dataset of erythroid cells, to facilitate more deep learning research on erythroid cells. BMEC contains 5,666 images of individual erythroid cells, each of which is extracted from bone marrow erythroid cell smears and professionally annotated as one of four types of erythroid cells. One key indicator for distinguishing erythroid cells is the cell shape, which is closely related to cell growth and maturation. Therefore, we design a novel shape-aware image classification network for fine-grained erythroid cell classification. The shape feature is extracted from the shape mask image and aggregated with the raw image feature by a shape attention module. With the shape-attended image feature, our network achieves superior classification performance (81.12% top-1 accuracy) on the BMEC dataset compared with the baseline methods. Ablation studies also demonstrate the effectiveness of incorporating shape information for fine-grained cell classification. To further verify the generalizability of our method, we test our network on two additional public white blood cell (WBC) datasets, and the results show that our shape-aware method generally outperforms recent state-of-the-art works on WBC classification. The code and the BMEC dataset can be found at https://github.com/wangye8899/BMEC.